Repeated Inverse Reinforcement Learning
We introduce a novel repeated Inverse Reinforcement Learning problem: the agent has to act on behalf of a human in a sequence of tasks and wishes to minimize the number of tasks in which it surprises the human by acting suboptimally with respect to how the human would have acted. Each time the human is surprised, the agent is provided with a demonstration of the desired behavior by the human. We formalize this problem, including how the sequence of tasks is chosen, in a few different ways and provide some foundational results.
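The interaction protocol the abstract describes can be pictured as a simple loop: the agent acts on each task, and a human demonstration after every surprise shrinks the set of reward hypotheses still consistent with the human. The following is a minimal illustrative sketch, not the paper's actual algorithm; the finite candidate set, the linear rewards over action features, and all function names are assumptions made for the toy example.

```python
def optimal_action(task, reward):
    # task: list of (action_name, feature_vector) pairs;
    # pick the action maximizing the linear reward w . phi(a)
    return max(task, key=lambda a: sum(w * f for w, f in zip(reward, a[1])))[0]

def repeated_irl(tasks, candidates, true_reward):
    """Toy sketch of the repeated-IRL protocol: keep every reward
    hypothesis consistent with all demonstrations seen so far, and
    count the tasks on which the agent surprises the human."""
    hypotheses = list(candidates)
    surprises = 0
    for task in tasks:
        agent_choice = optimal_action(task, hypotheses[0])  # act on any consistent hypothesis
        human_choice = optimal_action(task, true_reward)
        if agent_choice != human_choice:  # human is surprised
            surprises += 1
            # the demonstration rules out every hypothesis that disagrees with it
            hypotheses = [r for r in hypotheses
                          if optimal_action(task, r) == human_choice]
    return surprises
```

In this toy version each surprise strictly shrinks the hypothesis set, so the number of surprises is bounded by the number of candidate rewards; the paper's contribution is to obtain analogous bounds in continuous reward spaces and under different task-selection protocols.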
Reviews: Repeated Inverse Reinforcement Learning
The authors present a learning framework for inverse reinforcement learning wherein an agent provides policies for a variety of related tasks and a human determines whether the produced policies are acceptable. They present algorithms for learning a human's latent reward function over the tasks, and they give upper and lower bounds on the performance of the algorithms. They also address the setting where an agent is "corrected" as it executes trajectories. This is a comprehensive theoretical treatment of a new conceptualization of IRL that I think is valuable. I have broad clarification/scoping questions and a few minor points.
Kareem Amin, Nan Jiang, Satinder Singh
Published at the Neural Information Processing Systems Conference.